ABSTRACT

Latent trait models are presented that can be used 
for test design in the context of a theory about the variables that 
underlie task performance. Examples of methods for decomposing and 
testing hypotheses about the theoretical variables in task 
performance are given. The methods can be used to determine the
processing components that are involved in item performance. Three 
component latent trait models for underlying theoretical variables 
are described along with their maximum likelihood estimators. The 
item parameters can be used for item banking according to the 
influence of the underlying processing variables on item difficulty. 
Such estimators permit the test developer to choose items that 
represent specified information processing demands for the examinee. 
In this manner, what is measured by an aptitude test can be 
explicitly designed by specifying difficulty levels in the underlying 
processing components. The need for metacomponent latent trait
models was also considered. It was shown that both items and persons
vary on metacomponent parameters and that these parameters are
important for the predictive validity of an aptitude test. The main
conclusion to be drawn from these studies is that metacomponent
latent trait models are needed to estimate more fully the processing
abilities that underlie aptitude. (PN) 

Research on aptitude tests has changed considerably in the last decade.
The infusion of cognitive psychology into aptitude research has revitalized the
field. Research on the cognitive components of aptitude (Carroll, 1976;
Pellegrino & Glaser, 1979; R. Sternberg, 1977), as well as the cognitive correlates
of aptitude (Hunt, Lunneborg, & Lewis, 1975), not only has changed the content
of aptitude theory but also has influenced the type of data that is deemed
relevant.

Cognitive psychology differs markedly from psychometrics on the role of the 
stimulus in task performance. Cognitive psychology experiments often employ 
within-subjects factorial designs in which stimuli are systematically
manipulated to represent different levels of theoretical variables. Other theoretical
variables that could influence performance are either held constant over the set 
or counterbalanced to eliminate bias. These experiments are like psychological 
tests in that many problems of a single task type are presented. However, the
goal is to decompose the stimulus factors in the task that influence perfor- 
mance. 

Cognitive component analysis of aptitude seeks to decompose the factors 
that influence performance on aptitude test items. A wide variety of the item 
types that appear on popular tests have been studied experimentally. For exam- 
ple, linear syllogisms (Sternberg & Weil, 1981), series completions (Butter- 
field, in press), and spatial problems (Pellegrino, Mumaw, & Cantony, in press), 
as well as many other item types, have been studied in recent research on cogni- 
tive components. The factors that have been identified on these tasks include 
the processes, strategies, and knowledge stores that underlie performance. 

Cognitive component decomposition of aptitude offers a new approach to psy- 
chological measurement. This approach is test design, in which the qualities 
that are measured by a test are operationalized by the design of the test
stimuli. That is, just like an experimenter who designs tasks to test hypotheses, an
item writer manipulates the stimulus features of an item to represent specified 
theoretical constructs. Test design may be applied to many substantive areas 
and linked directly to psychometrics (see Embretson, in press-b). 

The test design approach involves qualitatively different assumptions about 
the nature of construct validation research. Traditionally, the construct
validity of a measure is assessed through the relationship of individual
differences on the test to other measures. Recently, Embretson (in press-a) has
elaborated on two separate goals in construct validation research: construct
representation and nomothetic span. Embretson (in press-a) hypothesized that the
shift of psychological research to structuralism permits construct
representation to be studied separately from nomothetic span. In Embretson's (in press-a)
conceptualization of construct validity, construct representation is assessed
from task decomposition data, while nomothetic span is assessed from individual
differences data. That is, the theoretical constructs that are represented in
performance may be studied independently from the utility of the test as a
measure of individual differences. Thus, the construct validity of the test
depends, in part, on the representation of the underlying constructs in the item
task.

The goal of the current paper is to present three latent trait models that 
can be used for test design. Estimating the parameters for these models depends 
on applying a method for task decomposition. Thus, prior to presenting the la- 
tent trait models, two methods for task decomposition will be presented, along 
with examples that illustrate their relevance for test design. Then, the three 
latent trait models will be presented. These are (1) the linear logistic latent 
trait model (Fischer, 1973); (2) the multicomponent latent trait model (Whitely, 
1980d); and (3) the general component latent trait model (Whitely, 1980a). The
latter is a generalization that includes the other two models. Last, the need
for more complex latent trait models to fully assess the important cognitive
components of aptitude will be examined. That is, the potential contribution of 
metacomponent latent trait models to test validity will be explored. 

Methods for Task Decomposition 

Methods for task decomposition are a major tool in contemporary research on 
cognitive components. The methods that are applied to decompose tasks may also 
be applied to the design for test stimuli. Two popular methods for task decom- 
position are (1) the method of complexity factors and (2) the method of sub- 
tasks. An example of how task decomposition methods can be used for test design 
will be presented for each method. 

In the method of complexity factors, each item is manipulated and/or scored
on one or more factors that represent the item's position on underlying theoret- 
ical variables. This method has been applied to attitude and personality items 
(Cliff, 1977; Cliff, Bradley, & Girard, 1973), as well as to a wide variety of 
cognitive tasks, such as linear syllogisms, geometric analogies, series comple- 
tion problems, and spatial rotation items. 

Figure 1 presents an example of a geometric analogy (Whitely & Schneider, 
1981) that represents the method of complexity factors. Two processing events 
have been indicated as having major influence on task difficulty (Mulholland, 
Pellegrino, & Glaser, 1980; Whitely & Schneider, 1981). These are (1) encoding 
complexity, which depends on the number of elements in the A term in the analogy 
and (2) transformational complexity, which depends on the number of transforma- 
tions that are required to convert A to B. In Figure 1 the A term contains two 
elements (the triangle and the line) and the A to B conversion requires three 
transformations (a shape change of the external element, an increase in the
number of internal elements, and a 90° rotation of the internal elements). Whitely
and Schneider (1981) found that two different types of transformation had
opposing influence on item difficulty. Distortions (change in shape or number)
were positively related to accuracy, while displacements (rotations) were
negatively related to accuracy.

Figure 1 

A Geometric Analogy, Similar to an Item 
on the Cognitive Abilities Test 

These findings indicate that the test developer can control item difficulty
by systematically varying the number of elements and the number and type of 
transformations in the item stimuli. An easy item would have one or two ele- 
ments and a distortion transformation. A difficult item would have several ele- 
ments and one or more displacement transformations. Thus, the test developer 
can fashion items to achieve desired levels of difficulty. 

In contrast to the method of complexity factors, the method of subtask re- 
sponses requires the theoretical variables to be identified from a series of 
subtasks that have been constructed from the items. Table 1 presents a verbal 
analogy item that is similar to items on the verbal section of the Cognitive
Abilities Test. The total item, as presented on the test, is given at the top.
Two components that have been supported by previous experimental research on
verbal analogies are Rule Construction and Response Evaluation (Pellegrino & 
Glaser, 1979; Whitely, 1980c; Whitely & Barnes, 1979). These are represented by 
the two subtasks in Table 1. Notice that although Response Evaluation is se- 
quentially dependent on Rule Construction, supplying the rule in the subtask 
makes possible independent assessment of these components. Thus, for each item, 
examinees respond to the total item as well as to the subtasks that represent 
processing components. 

By using the psychometric models to be described below, item difficulty on 
the components underlying the subtasks can be calibrated on a common scale. 
Figure 2 presents a scatterplot of the item parameters on the two components. 
It can be seen that item difficulty on the two components is not highly related. 
Thus, it is possible to design tests that reflect predominantly the influence of
one component or the other. For example, items that are easy on Response
Evaluation but difficult on Image Construction would measure abilities on the latter.
The test developer could select the items in the lower right corner to meet this 
specification. 
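
As a minimal sketch of this kind of selection, the following Python fragment filters a calibrated item bank by component difficulty. The bank contents, the component labels, the function name, and the cutoff values are all hypothetical and serve only to illustrate the idea; they are not taken from the data in Figure 2.

```python
def select_items(bank, easy_component, hard_component, easy_max, hard_min):
    """Return ids of items that are easy on one component and hard on another.

    bank maps an item id to a dict of component difficulties on a common scale.
    """
    return sorted(
        item_id
        for item_id, difficulty in bank.items()
        if difficulty[easy_component] <= easy_max
        and difficulty[hard_component] >= hard_min
    )


# Hypothetical calibrated difficulties (logit scale) for four items.
item_bank = {
    "item01": {"construction": 1.4, "evaluation": -1.2},
    "item02": {"construction": -0.8, "evaluation": 0.9},
    "item03": {"construction": 1.1, "evaluation": 1.3},
    "item04": {"construction": 0.9, "evaluation": -0.7},
}

# Items that are easy on evaluation but difficult on construction.
selected = select_items(item_bank, "evaluation", "construction",
                        easy_max=-0.5, hard_min=0.8)
print(selected)
```

The same filter with the component roles reversed would yield a test that reflects mainly the other component.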

Table 1

Subtask Set for Verbal Analogy Components

Total Item

    Cat : Tiger :: Dog : ____

    (a) Lion (b) Wolf (c) Bark (d) Puppy (e) Horse

Rule Construction

    Cat : Tiger :: Dog : ____

    Rule?

Response Evaluation

    Cat : Tiger :: Dog : ____

    (a) Lion (b) Wolf (c) Bark (d) Puppy (e) Horse

    Rule: A large or wild canine

Figure 2 

Scattergram of Image Construction Difficulty by Response
Evaluation Difficulty for 45 Verbal Classification Items

Desiderata for Psychometrics for Test Design

The indices that are derived from classical test theory or latent trait 
theory do not reflect the stimulus properties of a test item with respect to 
specified factors. There are several desiderata for test theory models that can 
be applied to test design. First, the method must be capable of testing
hypotheses about the specification factors. Obviously, a viable specification system
is one that is highly related to item difficulty. However, hypothesis testing
about the factors in items is also crucial to establishing a theory of the item
task. An item specification system is implicitly a theory of the task, so
it should be evaluated by the hypothesis-testing methods that are applied to
other theories. Second, the model must have parameters to describe the
difficulty of the items on the underlying factors. The unidimensional latent trait
models that are popular in test development do not have this property, since the
items are calibrated for only one dimension, the largest common factor in the
items. A model that allows designation of the difficulty factors according to
an a priori specification is required. Third, measurements of persons must be
included in the model. The need for person measurements is self-evident, since
the goal of aptitude testing is to measure individual differences. Fourth, the
model should specify the relationship between the item parameters and the person
abilities. Optimally, the test design approach involves selecting from a
calibrated item bank for a certain measurement goal. It is essential that the
influence of item parameters on person abilities is well specified in the model.

Component Latent Trait Models 

This section presents three component latent trait models that can be used 
to test hypotheses about construct representation and to assess factors for test
design. These are (1) the linear logistic latent trait model, (2) the
multicomponent latent trait model, and (3) the general component latent trait model.
The latter, a generalization of the other two, can handle more complex data
about cognitive processes.

The Linear Logistic Latent Trait Model 

The model. The linear logistic latent trait model (LLTM) is a unidimensional
model in which components are identified from item scores on complexity
factors that are postulated to determine item difficulty. To understand how 
components are identified, consider the geometric analogy presented in Figure 1, 
which is similar to items on the nonverbal section of the Cognitive Abilities 
Test. A recent study (Whitely & Schneider, 1981) compared three cognitive mod- 
els of geometric analogies, using the LLTM. All three models specify complexity 
factors in processing the item that influence response difficulty. 

The scores of the items on the complexity factors identify the components 
in an LLTM. The model can be examined by considering three equations. The 
first equation is the mathematical model for task processes. Here, a linear
model of the complexity factors, $c_{im}$, multiplied by their difficulties, $\eta_m$,
predicts item difficulty, $b_i^{*}$:

\[ b_i^{*} = \sum_m c_{im}\,\eta_m + d , \tag{1} \]

where

$c_{im}$ = the complexity of factor m in item i;

$\eta_m$ = the difficulty of complexity factor m; and

$d$ = a normalization constant.
The second equation presents the latent trait model for individual differences,
which is the Rasch latent trait model,

\[ P(x_{ij}=1 \mid \theta_j, b_i) = \frac{e^{(\theta_j - b_i)}}{1 + e^{(\theta_j - b_i)}} , \tag{2} \]

where $\theta_j$ = ability for person j and $b_i$ = difficulty for item i.

Equation 3 combines these two models to give the LLTM as follows:

\[ P(x_{ij}=1 \mid \theta_j, \boldsymbol{\eta}, d) = \frac{e^{(\theta_j - (\sum_m c_{im}\eta_m + d))}}{1 + e^{(\theta_j - (\sum_m c_{im}\eta_m + d))}} . \tag{3} \]

If the number of complexity factors equals the number of items, and each
item contains only one complexity factor, then the LLTM is equivalent to the
Rasch latent trait model. When the number of complexity factors is less than
the number of items, the LLTM is a linearly constrained model of item difficulty.
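
The relationship among Equations 1 to 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented here, and the factor difficulties and complexity scores are hypothetical values, not estimates from the geometric analogy study.

```python
import math


def lltm_difficulty(c_i, eta, d=0.0):
    """Equation 1: item difficulty as a weighted sum of complexity scores."""
    return sum(c_im * eta_m for c_im, eta_m in zip(c_i, eta)) + d


def rasch_probability(theta_j, b_i):
    """Equation 2: probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta_j - b_i)))


def lltm_probability(theta_j, c_i, eta, d=0.0):
    """Equation 3: Rasch model with difficulty given by the linear model."""
    return rasch_probability(theta_j, lltm_difficulty(c_i, eta, d))


# Hypothetical difficulties for two complexity factors.
eta = [0.4, 0.9]
easy_item = [1, 0]   # low complexity scores on both factors
hard_item = [3, 2]   # high complexity scores on both factors

for label, scores in (("easy", easy_item), ("hard", hard_item)):
    b_star = lltm_difficulty(scores, eta)
    p = lltm_probability(0.0, scores, eta)
    print(label, round(b_star, 2), round(p, 3))
```

Because the ability enters only through the difference between ability and the predicted difficulty, raising any complexity score with a positive factor difficulty lowers the response probability for every examinee.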

A major advantage of the LLTM is the possibility of comparing alternative 
models of item difficulty by difference tests based on the log likelihood of 
the data, given the model. For example, the fit of any restricted model of the 
task components can be compared to the fit of the Rasch model, which can be
regarded as a saturated model of item difficulty. Further, if alternative models
of task components are hierarchically nested, then direct comparisons between
the models are also possible. Thus, hypothesis testing to establish a valid
model of the task complexity factors is an important capability of the LLTM.
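
A difference test of this kind can be sketched as follows. The log likelihood values and parameter counts are hypothetical, and the function name is an assumption made for illustration. Under the usual asymptotic theory for nested models, the statistic is referred to a chi-square distribution with the returned degrees of freedom.

```python
def likelihood_ratio_statistic(loglik_restricted, loglik_general,
                               n_params_restricted, n_params_general):
    """Return (G2, df) for comparing a restricted model with a more general one.

    G2 = -2 * (ln L_restricted - ln L_general); df is the difference in the
    number of free parameters.
    """
    df = n_params_general - n_params_restricted
    if df <= 0:
        raise ValueError("the general model must have more free parameters")
    g2 = -2.0 * (loglik_restricted - loglik_general)
    return g2, df


# Hypothetical example: an LLTM with 3 complexity factors against a Rasch
# model with 29 free item difficulties (30 items, one fixed for normalization).
g2, df = likelihood_ratio_statistic(-1250.0, -1230.0, 3, 29)
print(g2, df)
```

A large statistic relative to its degrees of freedom would indicate that the complexity factors do not reproduce the item difficulties as well as the saturated model does.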

Another important aspect of the model is that its parameters describe each
item by component complexity rather than just item difficulty. These parameters
can be useful in item banking, so that the contribution of a processing
complexity factor to each item is systematically specified. Notice, however, that the
model is unidimensional, since only one ability parameter is specified for each 
person. 

Estimation. Fischer (1973) derived conditional maximum likelihood
estimators for the item parameters of the LLTM, $\eta_m$. Although conditional maximum
likelihood estimators are statistically superior to unconditional estimators for
several reasons (Fischer, 1981), they are impractical for large sets (I > 60).
Thus, unconditional maximum likelihood estimators are useful for LLTM item
parameters.

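The first derivative of the log likelihood function for unconditional
maximum likelihood estimation of the item difficulty $b_i$ is

\[ \frac{\partial \ln L}{\partial b_i} = -\sum_j x_{ij} + \sum_j \frac{e^{(\theta_j - b_i)}}{1 + e^{(\theta_j - b_i)}} . \tag{4} \]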
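
Thissen (1981) has shown that the first derivative of the log likelihood
function L for the LLTM with respect to $\eta_m$ may be obtained from

\[ \frac{\partial \ln L}{\partial \eta_m} = \sum_i \frac{\partial \ln L}{\partial b_i^{*}}\,\frac{\partial b_i^{*}}{\partial \eta_m} = \sum_i c_{im}\,\frac{\partial \ln L}{\partial b_i^{*}} . \tag{5} \]

Combining Equation 4 with Equation 5 gives the first derivative of the
unconditional log likelihood function with respect to $\eta_m$,

\[ \frac{\partial \ln L}{\partial \eta_m} = \sum_i c_{im} \left[ -\sum_j x_{ij} + \sum_j \frac{e^{(\theta_j - b_i^{*})}}{1 + e^{(\theta_j - b_i^{*})}} \right] , \tag{6} \]

where $b_i^{*}$ is defined as in Equation 1.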
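
The derivative can be checked numerically. Under the Rasch model of Equation 2 with item difficulty given by Equation 1, the chain rule yields a derivative with respect to each factor difficulty equal to the sum over items of the complexity score times the sum over persons of (predicted probability minus observed response). The sketch below computes that quantity; the data, abilities, and function names are hypothetical, and abilities are treated as known for simplicity.

```python
import math


def p_correct(theta_j, b_i):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta_j - b_i)))


def item_difficulty(c_i, eta, d=0.0):
    """Linear model of item difficulty from complexity scores."""
    return sum(c_im * eta_m for c_im, eta_m in zip(c_i, eta)) + d


def log_likelihood(x, theta, c, eta, d=0.0):
    """Unconditional log likelihood; x[j][i] is person j's response to item i."""
    total = 0.0
    for i, c_i in enumerate(c):
        b_i = item_difficulty(c_i, eta, d)
        for j, theta_j in enumerate(theta):
            p = p_correct(theta_j, b_i)
            total += math.log(p) if x[j][i] == 1 else math.log(1.0 - p)
    return total


def gradient_eta(x, theta, c, eta, d=0.0):
    """Derivative of the log likelihood with respect to each eta_m."""
    grad = [0.0] * len(eta)
    for i, c_i in enumerate(c):
        b_i = item_difficulty(c_i, eta, d)
        # Derivative with respect to b_i: expected minus observed correct.
        dl_db = sum(p_correct(t, b_i) - x[j][i] for j, t in enumerate(theta))
        for m, c_im in enumerate(c_i):
            grad[m] += c_im * dl_db
    return grad


# Hypothetical data: 4 persons by 3 items, 2 complexity factors.
responses = [[1, 0, 0], [1, 1, 0], [1, 1, 1], [0, 1, 0]]
abilities = [-1.0, 0.0, 1.5, 0.5]
complexity = [[1, 0], [2, 1], [3, 2]]
eta_values = [0.3, 0.6]

analytic = gradient_eta(responses, abilities, complexity, eta_values)
print([round(g, 4) for g in analytic])
```

Setting each component of this gradient to zero gives the estimation equations; in practice abilities would be estimated jointly rather than fixed.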

Multicomponent Latent Trait Model

The model. The multicomponent latent trait model (MLTM) is a multidimensional
model in which components are identified from subtasks that represent the
processing components in item solving. Like many information processing models
for complex tasks (e.g., Hunt, 1976), it is assumed that information from several
component events is required to solve the item. The relationship between the
component events may be either (1) independent, where the processing or outcome 
of one event does not influence any other event, or (2) sequentially dependent, 
where information from a component event provides prerequisite information for 
processing on later events. 

An MLTM uses subtask data to identify the components. The mathematical model
of processes in the MLTM links the component responses to the total item.
Equation 7 presents a mathematical model for independent components in which the
response probability for the total item is the product of the component
likelihoods:

\[ P(x_{ijT}=1) = a \prod_k P(x_{ijk}=1) + g \left[ 1 - \prod_k P(x_{ijk}=1) \right] , \tag{7} \]

where

$P(x_{ijT}=1)$ = the probability that the composite task is correct for person j
on item i,

$P(x_{ijk}=1)$ = the probability that the subtask for component k is correct
for person j on item i,

$a$ = the probability that an item is solved when the component
information is available, and

$g$ = the probability of solving an item when the component
information is not available.

Unlike the original MLTM, the model includes parameters for application of
the component information, a, which represents metacomponent or executive
functioning, and for an alternative method for solving the item, g, such as guessing
or rote association to the stem. Other mathematical models are possible (e.g.,
Whitely, 1980d), but all models relate the component likelihoods to the full
item likelihood.

As for the LLTM, the latent trait model for individual differences is the
1-parameter logistic latent trait model, as presented in Equation 2. However,
in the MLTM the latent trait models are given for component subtask responses
rather than for the total item. The MLTM specifies that responses to the
subtasks depend on the ability of person j on component k and the difficulty of
item i on component k, as follows:

\[ P(x_{ijk}=1 \mid \theta_{jk}, b_{ik}) = \frac{e^{(\theta_{jk} - b_{ik})}}{1 + e^{(\theta_{jk} - b_{ik})}} , \tag{8} \]

where $\theta_{jk}$ = the ability of person j on component k and $b_{ik}$ = the difficulty of
item i on component k. The LLTM, in contrast, is a latent trait model for
responses to the total item and does not model component responses.

The full model, presented in Equation 9, combines the latent trait model
with the mathematical model. It can be seen that the total item response is
conditional on K component abilities as well as on K component item
difficulties.

\[ P(x_{ijT}=1 \mid \boldsymbol{\theta}_j, \mathbf{b}_i) = (a-g) \prod_k \frac{e^{(\theta_{jk} - b_{ik})}}{1 + e^{(\theta_{jk} - b_{ik})}} + g , \tag{9} \]

where $\boldsymbol{\theta}_j$ = the vector of K component abilities for person j and $\mathbf{b}_i$ = the vector
of K component difficulties for item i.
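
A minimal Python sketch of the full model follows, assuming independent components. The function names are invented for illustration; the values of a and g are the rounded conditional probabilities reported in Table 2, and the abilities and difficulties are hypothetical.

```python
import math


def component_probability(theta_jk, b_ik):
    """1-parameter logistic model for a single component subtask."""
    return 1.0 / (1.0 + math.exp(-(theta_jk - b_ik)))


def mltm_total_probability(theta_j, b_i, a=1.0, g=0.0):
    """Probability that the total item is correct, given K components.

    theta_j and b_i hold one ability and one difficulty per component.
    """
    product = 1.0
    for theta_jk, b_ik in zip(theta_j, b_i):
        product *= component_probability(theta_jk, b_ik)
    return (a - g) * product + g


# Person of average ability on two components of average difficulty.
p_total = mltm_total_probability([0.0, 0.0], [0.0, 0.0], a=0.84, g=0.45)
print(round(p_total, 4))
```

Writing the model as (a - g) times the component product plus g is algebraically the same as weighting the product by a and its complement by g.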

Although typical test data (with the notable exception of linked items with 
a common stem) only occasionally assess subtask responses, there are several 
reasons why it may be useful to obtain such data as part of test development. 
First, the various processing components from which information is required for 
item solution are theoretically distinct. Experimental cognitive research has 
supported the independence of components within a task by additive factor (S. 
Sternberg, 1969) or subtractive factor (Pachella, 1974) modeling methods. Sec- 
ond, if the components are sufficiently elementary, they should generalize 
across tasks and possibly account for differential patterns of correlations in 
performance on separate types of items (Carroll, 1974). Third, individual dif- 
ferences on different components correlate only moderately and show differential 
validity in predicting performance on other tasks (R. Sternberg, 1977; Whitely, 
1981). Fourth, component difficulties are sometimes not highly correlated in 
item sets, so that it is possible to select items of the same type that measure 
different component abilities. Consider for example, a two-component item, such 
as presented in Table 1. If items are so easy on one component that nearly 
everyone has a high probability of executing it correctly, then it can be shown 
that the likelihood of correctly answering the items is well described by the 
regression of the response likelihoods on the other component ability (Whitely,
1981).

The relationship of the MLTM parameters to the joint response to the
components, $C_k$, and the total item, T, is explicated more completely by considering
the probability sample space. There are $2^{K+1}$ possible response patterns. Table
2 shows the eight response patterns for a two-component item, along with an
expression for the probability of the pattern from the MLTM. It can be seen that
the a and g parameters link the component responses to the total item, while the
other symbols represent the probability of the component response patterns,
which vary systematically over persons and items, according to the 1-parameter
logistic latent trait model.
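
The sample space can be enumerated directly. The sketch below lists every joint pattern of K component responses and the total item response and attaches its probability under the model; the component probabilities are hypothetical, and the function name is an assumption. The pattern probabilities should sum to 1.

```python
from itertools import product


def pattern_probabilities(p_components, a, g):
    """Map each pattern (x_1, ..., x_K, x_T) to its model probability."""
    table = {}
    for xs in product((1, 0), repeat=len(p_components)):
        joint = 1.0
        for x, p in zip(xs, p_components):
            joint *= p if x == 1 else 1.0 - p
        # a applies only when every component is correct; otherwise g applies.
        p_total_correct = a if all(xs) else g
        table[xs + (1,)] = joint * p_total_correct
        table[xs + (0,)] = joint * (1.0 - p_total_correct)
    return table


# Hypothetical component probabilities for one person on one two-component item.
patterns = pattern_probabilities([0.7, 0.6], a=0.84, g=0.45)
for pattern, probability in patterns.items():
    print(pattern, round(probability, 4))
```

With two components the function returns eight patterns, matching the layout of Table 2.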

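Estimation. No estimators were developed in Whitely (1980d) for the MLTM.
However, given a probability space such as specified in Table 2, the likelihood
of any response pattern is given by

\[ L(\mathbf{x}_{ij}, x_{ijT}) = \left[ a^{x_{ijT}} (1-a)^{1-x_{ijT}} \right]^{\prod_k x_{ijk}} \left[ g^{x_{ijT}} (1-g)^{1-x_{ijT}} \right]^{1-\prod_k x_{ijk}} \prod_k P_{ijk}^{x_{ijk}}\, Q_{ijk}^{1-x_{ijk}} , \tag{10} \]

where

$\mathbf{x}_{ij}$ = the vector of responses of person j to the components for item i,
$x_{ijT}$ = the response of person j to total item i, and
$P_{ijk}$ = the component probability from Equation 8, with $Q_{ijk} = 1 - P_{ijk}$.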

Notice that the entry of the parameters a and g into the likelihood depends on
the value of $\prod_k x_{ijk}$ and that $\prod_k x_{ijk}$ equals 1.0 only if all component outcomes are
correct. Note also that a contributes to the log likelihood only if all the
components are executed correctly, while g contributes when at least one
component is incorrect. This pattern is specified in Table 2.

The likelihood of the data set can be obtained by multiplying the response
likelihoods over persons and items. Since neither a nor g varies over persons or
items, it can be concluded immediately from well-known theorems on the binomial
distribution that their maximum likelihood estimators are the relative
frequencies

\[ \hat{a} = \frac{\sum_j \sum_i \left( \prod_k x_{ijk} \right) x_{ijT}}{\sum_j \sum_i \prod_k x_{ijk}} \tag{11} \]

and

\[ \hat{g} = \frac{\sum_j \sum_i \left( 1 - \prod_k x_{ijk} \right) x_{ijT}}{\sum_j \sum_i \left( 1 - \prod_k x_{ijk} \right)} . \tag{12} \]

Thus, a is given by the relative frequency of correctly answering the item when
all components are executed correctly, while g is given by the relative
frequency of correctly answering the total item when at least one component is executed
incorrectly.
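
These relative frequencies can be computed from tabulated response patterns. The sketch below applies them to the eight pattern frequencies reported in Table 2; the function name and the record layout (component responses, total item response, frequency) are assumptions made for illustration.

```python
def estimate_a_g(records):
    """Relative-frequency estimates of a and g.

    records is an iterable of (component_responses, total_response, frequency).
    """
    n_all = correct_all = 0      # patterns with every component correct
    n_some = correct_some = 0    # patterns with at least one component wrong
    for components, total, frequency in records:
        if all(components):
            n_all += frequency
            correct_all += frequency * total
        else:
            n_some += frequency
            correct_some += frequency * total
    return correct_all / n_all, correct_some / n_some


# Frequencies from Table 2: ((x_1, x_2), x_T, f).
table2 = [
    ((1, 1), 1, 1864), ((1, 1), 0, 351),
    ((0, 1), 1, 518), ((0, 1), 0, 518),
    ((1, 0), 1, 84), ((1, 0), 0, 101),
    ((0, 0), 1, 87), ((0, 0), 0, 221),
]

a_hat, g_hat = estimate_a_g(table2)
print(round(a_hat, 2), round(g_hat, 2))
```

The estimate of g pools all patterns with at least one failed component, so it is a single value rather than the separate conditional probabilities shown for each pattern in the table.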

Table 2

Frequencies and Conditional Probabilities for
Joint Response Patterns on Verbal Analogies

 C1   C2   T       f    P(x_T | x_1, x_2)   Notation

  1    1   1    1864          .84           a P_{x_1} P_{x_2}
  1    1   0     351          .16           (1-a) P_{x_1} P_{x_2}
  0    1   1     518          .50           g Q_{x_1} P_{x_2}
  0    1   0     518          .50           (1-g) Q_{x_1} P_{x_2}
  1    0   1      84          .45           g P_{x_1} Q_{x_2}
  1    0   0     101          .55           (1-g) P_{x_1} Q_{x_2}
  0    0   1      87          .28           g Q_{x_1} Q_{x_2}
  0    0   0     221          .72           (1-g) Q_{x_1} Q_{x_2}

\[ P_{x_k} = P(x_{ijk}=1 \mid \theta_{jk}, b_{ik}) = \frac{e^{(\theta_{jk} - b_{ik})}}{1 + e^{(\theta_{jk} - b_{ik})}} , \qquad Q_{x_k} = 1 - P_{x_k} \]

\[ P(x_T = 1) = a \prod_k P_{x_k} + g \left[ 1 - \prod_k P_{x_k} \right] \]

The required derivative of the log likelihood for unconditional maximum
likelihood estimation of the item parameters $b_{ik}$ is

\[ \frac{\partial \ln L}{\partial b_{ik}} = -\sum_j x_{ijk} + \sum_j \frac{e^{(\theta_{jk} - b_{ik})}}{1 + e^{(\theta_{jk} - b_{ik})}} . \tag{13} \]

Setting the derivative to zero leads to the well-known equations for
unconditional maximum likelihood estimation of the 1-parameter logistic latent trait
model (cf. Lord & Novick, 1968). As for other exponential families of
distributions, estimation equations for the latent trait model can be obtained by
equating the observed sufficient statistics with their expectancies, given the
parameters (Andersen, 1980). In the current development, however, estimation
requires I equations for each of K components of the MLTM. Notice that the item
parameters for each component, $b_{ik}$, involve only the responses to the relevant
subtask data, $x_{ijk}$. It can be seen that unconditional maximum likelihood
estimators may be obtained independently from each subtask to maximize the log
likelihood of the joint response pattern $\mathbf{x}_{ij}$, $x_{ijT}$.
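
Equating the observed number correct on a subtask with its expectation can be solved iteratively. The following is a minimal Newton-Raphson sketch for one item on one component, assuming that the component abilities are already known; in a full unconditional procedure abilities and difficulties would be estimated jointly. The function name, data, and abilities are hypothetical.

```python
import math


def estimate_item_difficulty(responses, thetas, b_start=0.0,
                             tol=1e-8, max_iter=50):
    """Solve sum_j x_ijk = sum_j P_ijk for b_ik with abilities held fixed."""
    observed = sum(responses)
    if observed == 0 or observed == len(responses):
        raise ValueError("no finite estimate when all responses are alike")
    b = b_start
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(t - b))) for t in thetas]
        gradient = sum(probs) - observed                  # first derivative
        information = sum(p * (1.0 - p) for p in probs)   # minus second derivative
        step = gradient / information
        b += step
        if abs(step) < tol:
            break
    return b


# Hypothetical subtask responses of four persons with known abilities.
b_hat = estimate_item_difficulty([0, 0, 1, 1], [-1.0, 0.0, 1.0, 2.0])
print(round(b_hat, 4))
```

Because each component's difficulties depend only on that component's subtask responses, the same routine could be run separately for every item within every component.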

A General Multifactor Latent Trait Model 

The model. The preceding developments have shown that the LLTM and the
MLTM differ substantially in component identification. LLTMs estimate difficul- 
ty of complexity factors that are related to item difficulty, while MLTMs esti- 
mate item and person parameters for component outcomes. The different methods 
of component identification make possible a meaningful unification of these two 
models. 

Consider the verbal analogy that is presented in Table 1. This analogy was
presented previously with the MLTM. Although the Response Evaluation component
is identical to the previous example, the Rule Construction component is
postulated to be influenced by the several processing complexity factors, $c_{imk}$, that
are listed in Table 3. These factors concern the difficulty of inferring the
target relationship (i.e., Fist:Clench). The inference elicitation and inference
contextualization factors are the ease of inferring the target relationship in
the initial encoding of the relational pair and in the context of the unmatched
term "Teeth," respectively. Previous research on analogies (R. Sternberg, 1977)
as well as research in memory organization (Reitman, 1965) suggests that
relational span is also positively related to item solving, since extraneous
relationships can interfere with solving the analogy. In the current example,
the relational span factors are measured by the mean number of relationships
that are educed between the word pair when presented alone and in the context of
the unmatched term, respectively. The remaining factors represent the relative
frequency of various types of context effects in inferring the target
relationship (i.e., selecting or combining initial relationships, inferring new
relationships, and so forth).

Scores for each item on the complexity factors were obtained from other 
research studies on analogies (Embretson & Curtright, 1981). However, it is

Table 3

Complexity Factors in Analogical Reasoning Components

Rule Construction Component

    Fist : Clench :: Teeth : ___

    Rule?

    $c_{imk}$ = complexity of factor m for component k on item i

    $c_{i11}$ = inference elicitation, the probability that the target
                relationship is educed from the initial word pair
    $c_{i21}$ = relational network span, the number of relationships educed
    $c_{i31}$ = inference contextualization, the probability that the target
                relationship is educed in the context of all three stem stimuli
    $c_{i41}$ to $c_{i71}$ = type of contextualization effect

Response Evaluation Component

    Fist : Clench :: Teeth : ___

    (1) Pull (2) Brush (3) Grit (4) Gnaw (5) Jaw

    Rule: Angry reaction done with "teeth."

important to note that in this example the complexity factors have effects on
the component information outcomes rather than on the total item response.

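Equation 14 presents a model that specifies both processing complexity
factors and processing component outcomes:

\[ P(x_{ijT}=1 \mid \boldsymbol{\theta}_j, \boldsymbol{\eta}, \mathbf{d}) = (a-g) \prod_k \frac{e^{(\theta_{jk} - (\sum_m c_{imk}\eta_{mk} + d_k))}}{1 + e^{(\theta_{jk} - (\sum_m c_{imk}\eta_{mk} + d_k))}} + g . \tag{14} \]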



As for the MLTM, the probability of the correct response to the intact item
is conditional on a vector of component abilities, $\boldsymbol{\theta}_j$, and component item
difficulties. However, item difficulty for each component is determined by a
linear model of the complexity factors for the component, $c_{imk}$.

Equation 14 is a general multifactor latent trait model (GLTM) for response
processes. If only one information outcome is measured for the item (i.e., the
response to the intact test item), then the model is identical to the LLTM. In
this case, complexity factors would be scored for the total item. The
parameters a and g drop out of the model, since the response to the total item would
be given by the response to the single component outcome that is observed. If no
complexity factors are postulated for each component outcome, but several component
outcomes are observed, then the model is the MLTM. In this case, the Rasch model is
specified for each component. However, for tasks with multiple information
outcomes and processing complexity factors that influence these outcomes, the full
model can be utilized.
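
The way the general model contains the two special cases can be sketched in Python. The function name, argument layout, and numerical values are assumptions for illustration; in particular, the single-outcome case is represented here by setting a = 1 and g = 0, which is one way to express that those parameters drop out.

```python
import math


def gltm_probability(theta_j, c_i, eta, d, a=1.0, g=0.0):
    """Total item probability with a linear difficulty model per component.

    theta_j[k] is the ability on component k; c_i[k] and eta[k] hold the
    complexity scores and factor difficulties for component k; d[k] is the
    normalization constant for component k.
    """
    product = 1.0
    for theta_jk, c_ik, eta_k, d_k in zip(theta_j, c_i, eta, d):
        b_ik = sum(c * e for c, e in zip(c_ik, eta_k)) + d_k
        product *= 1.0 / (1.0 + math.exp(-(theta_jk - b_ik)))
    return (a - g) * product + g


# One observed outcome with two complexity factors: reduces to the LLTM.
p_lltm_case = gltm_probability([3.0], [[3, 2]], [[0.4, 0.9]], [0.0])

# Two outcomes whose difficulty comes only from the constants d_k: the
# component models are plain Rasch models, as in the MLTM.
p_mltm_case = gltm_probability([0.0, 0.0], [[0], [0]], [[0.0], [0.0]],
                               [0.0, 0.0], a=0.84, g=0.45)
print(round(p_lltm_case, 4), round(p_mltm_case, 4))
```

In the second call the complexity scores contribute nothing, so each component difficulty is a free constant, which is the sense in which the Rasch model is specified for each component.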

Estimation. The likelihood of the joint response pattern, $\mathbf{x}_{ij}$, $x_{ijT}$, given
the parameters of the model, is given by Equation 10, except that the component
likelihoods are given by the LLTM for the component as follows:

\[ P(x_{ijk}=1 \mid \theta_{jk}, \boldsymbol{\eta}_k, d_k) = \frac{e^{(\theta_{jk} - (\sum_m c_{imk}\eta_{mk} + d_k))}}{1 + e^{(\theta_{jk} - (\sum_m c_{imk}\eta_{mk} + d_k))}} . \]

The estimators of a and g for the GLTM are the same as for the MLTM, as given
in Equation 11 and Equation 12. Since item difficulty is linearly constrained
within a component for the LLTM, the first derivative of the log likelihood
function with respect to $\eta_{mk}$ is required for unconditional maximum likelihood
estimation of the items. Using the development given above for the LLTM, it can be
seen that the first symbolic partial derivative with respect to $\eta_{mk}$ in the GLTM is

\[ \frac{\partial \ln L}{\partial \eta_{mk}} = \sum_i c_{imk} \left[ -\sum_j x_{ijk} + \sum_j \frac{e^{(\theta_{jk} - (\sum_m c_{imk}\eta_{mk} + d_k))}}{1 + e^{(\theta_{jk} - (\sum_m c_{imk}\eta_{mk} + d_k))}} \right] . \]

A Fortran program, MULTICOMP (Whitely & Nieh, 1981), is available to estimate the
parameters of the GLTM.

Future Directions: Metacomponent Latent Trait Models? 

The component models that were presented above do not fully reflect the
complexity of the information processes that are involved in task performance.
Metacomponent variables that determine when to execute component processes and
which processes to execute have great impact on problem solving. For example,
problem-solving strategies are an important concept in problem-solving theory
(Davis, 1973; Newell & Simon, 1978). Similarly, problem-solving strategies have
long been thought to be major aspects of individual differences in intelligence,
particularly for those theories of intelligence that emphasize adaptability
(Pintner, 1921; Sternberg, 1979; Woodrow, 1921). Thus, on theoretical
grounds, a complete model of information processing on intelligence test items
should include strategy variables.

The MLTM that is presented in Table 2 postulates that individuals have
equal likelihoods of applying the various strategies. That is, the strategy
application parameter, a, and the parameter for successful application of other
strategies, g, do not vary over persons or items. As suggested above, this
assumption is unwarranted on both theoretical and empirical grounds, since
metacomponents are known to influence task performance. Thus, to fully represent
processing, the strategy application parameters need to vary over persons or
items.

A metacomponent latent trait model would include strategy application 
parameters for persons or items. However, estimation of these parameters will be 
complex. Returning to Table 2, it should be obvious that the symbolic partial 
derivative with respect to a, for example, will not be simple if variability for 
either persons or items is included in the model. That is, the estimation of 
such parameters will depend on the outcome of the other parameters, including b. 
The parameter a can only be estimated from response patterns in which both 
components are correct, as in the first two response patterns in Table 2. Not only 
will the estimation algorithm necessarily be complex, but also estimation error 
will vary as a function of the other component responses. For example, for 
persons with few accurate component outcomes, the parameter will not be estimated 
reliably, since little information about the parameter will be available. 
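
The dependence of estimation error on the number of usable response patterns 
can be illustrated with a simple binomial approximation. This is only a sketch: 
it treats the application parameter as a plain proportion over the patterns in 
which both components are correct, which is an assumption rather than the 
maximum likelihood estimator discussed here, and the function name is 
hypothetical. 

```python
import math

def application_se(a_hat, n_both_correct):
    """Approximate binomial standard error of a proportion-based estimate of a."""
    if n_both_correct <= 0:
        return float("inf")  # no usable response patterns, so a is not estimable
    return math.sqrt(a_hat * (1.0 - a_hat) / n_both_correct)

# The same estimate is far less precise when few patterns are usable.
for n in (2, 10, 40):
    print(n, round(application_se(0.8, n), 3))
```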

An obvious question at this point is the potential utility of developing 
the estimators for the more complex metacomponent models. Two questions about 
metacomponent parameters need to be addressed: (1) Do items and persons vary in 
propensity for applying the various components? and (2) Do individual 
differences in metacomponents contribute to the criterion-related validity of an 
aptitude test? To answer these questions, data from two studies on verbal aptitude 
(Whitely, 1980, 1982) were reanalyzed to include the metacomponent parameters. 

A reanalysis of data originally collected by Whitely (1980) shows the 
variability among items and persons in two metacomponents that could be estimated 
with the latent trait model in Table 2. These are application of the 
rule-oriented strategy, a, and application of other strategies, such as guessing, g. 
Figure 3 shows frequency distributions of the a parameters for two item types, 
verbal analogies and verbal classifications. In this analysis, a was computed 
as the conditional probability that examinees would solve the total item when 
the component information was available. Admittedly, this estimator of a is 
crude, but it does provide at least some indication of its nature. It can be 
seen in Figure 3a that the parameter values tend to be high on both item types 
but that examinees vary widely on the parameter. It is not clear to what extent 
this distribution reflects differing degrees of accuracy of estimating a for 
individuals. 

Figure 3b shows the distribution of the g parameter for examinees, computed 
as the conditional probability of solving the total item, given that the 
component information is not available. It can be seen that this value centers 
around .50 for both item types and that individuals vary widely in these values. 
Thus, for both strategy application and guessing, some individual differences 
are indicated. Figure 4 and Figure 5 present stem and leaf distributions of the a 
and g parameters, respectively, for items. As for individuals, considerable 
variability is indicated. 
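
The crude conditional-probability estimators described above can be sketched as 
follows. The sketch interprets "component information available" as all 
components having been answered correctly on an item; that interpretation, the 
data layout, and the function name are assumptions made for illustration. 

```python
def crude_metacomponent_estimates(records):
    """Crude per-examinee estimates of a and g.

    records: iterable of (components_correct, total_correct) boolean pairs,
    one pair per item. Returns (a_hat, g_hat); either value is None when no
    items fall in the corresponding condition.
    """
    n_info = n_info_solved = 0
    n_no_info = n_no_info_solved = 0
    for components_correct, total_correct in records:
        if components_correct:
            n_info += 1
            n_info_solved += int(total_correct)
        else:
            n_no_info += 1
            n_no_info_solved += int(total_correct)
    a_hat = n_info_solved / n_info if n_info else None
    g_hat = n_no_info_solved / n_no_info if n_no_info else None
    return a_hat, g_hat

# Hypothetical examinee: four items with components correct, two without.
records = [(True, True), (True, True), (True, False), (True, True),
           (False, True), (False, False)]
print(crude_metacomponent_estimates(records))  # (0.75, 0.5)
```

The same function applies to items if each record is taken over examinees 
rather than over items. 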

A second study (Whitely, 1982) contains data on the contribution of 
metacomponent variables to test validity. The Whitely study examines the 
relationship of individual differences in strategy application to a major 
criterion for aptitude test validity, educational achievement. In this study, 
data on the achievement of 99 parochial high school students were collected, in 
addition to their performance on an analogical reasoning test and on several 
subtasks that represented components and metacomponents in solving analogies. 

The contributions of strategy application parameters (i.e., a) and other 
strategies (g) were examined in separate analyses. In the Whitely (1982) 
analyses, individual differences in strategy application were examined for two 
strategies that led to analogy solving. These were (1) a rule-oriented strategy 
and (2) a response elimination strategy. The contribution of the strategy 
application parameter to test validity was examined by structural equation 
models (Joreskog, 1974). 

Figure 3 

Frequency Distribution of Application (a) and Guessing (g) 
Probabilities for Examinees on Two Item Types 

Figure 4 

Stem and Leaf Distribution of Application (a) 
Probabilities for Two Item Types 

In these models, individual differences in both applying and performing the 
components of the strategies were measured as independent variables. The 
dependent variables included performance on the analogical reasoning test as well 
as scores on eight area achievement tests. For both the rule-oriented strategy and 
the response elimination strategy, it was found that adding strategy application 
to the strategy performance variables significantly increased prediction of both 
analogical reasoning and achievement. The differences that were obtained by 
adding the strategy application variables to the covariance models were highly 
significant for both the rule-oriented (χ² = 65.57, p < .01) and the response 
elimination strategy (χ² = 65.57, p < .01). For both strategies the application 
variable was significantly related to analogical reasoning (t = 7.55 and 2.07, 
respectively, for the rule-oriented and response elimination strategies), 
showing that strategy application is an important metacomponent for individual 
differences in analogical reasoning. 

Figure 5 

Stem and Leaf Distribution of Guessing (g) 
Probabilities for Two Item Types 

Table 4 presents data on the contribution of the application parameters to 
the prediction of achievement in several areas. Indices that are comparable to 
multiple regression analyses were obtained from the structural equation 
analyses. For each of the strategies and for the two strategies combined with a 
guessing strategy, Table 4 shows the F value for the application metacomponent 
and its incremental contribution to explaining variance of each achievement 
test, as well as the proportion of variance explained. The application 
metacomponent for the rule-oriented strategy significantly contributed to the 
validity for predicting Mathematics and Sources. The application metacomponent for 
the response elimination strategy significantly increased prediction for several 
achievement areas, including Reading Comprehension, Vocabulary, Language Use, 
Spelling, Social Science, Science, and Sources. 



Table 4 

Contribution of Metacomponent and Strategy 
Parameters to Predicting Achievement 

                               Specification   Contribution of Metacomponent
Strategy and                     Accuracy                      Reduction of
Achievement Area                Multiple R            F         Error (ΔR²)

Rule-Oriented Strategy
  Reading Comprehension             .67               .28           .01
  Vocabulary                        .50               .02           .01
  Language Use                      .67              2.01           .02
  Spelling                          .38               .05           .00
  Mathematics                       .52              4.84*          .06
  Social Science                    .51              1.48           .02
  Science                           .74               .01           .00
  Source                            .66              9.18**         .09

Response Elimination Strategy
  Reading Comprehension             .71              5.42*          .05
  Vocabulary                        .66             14.21**         .14
  Language Use                      .73              6.81*          .06
  Spelling                          .52              8.35**         .11
  Mathematics                       .50               .55           .01
  Social Science                    .57              8.58**         .10
  Science                           .77              5.71*          .04
  Source                            .71             17.64**         .15

Guessing Strategy
  Reading Comprehension             .60              9.93**         .12
  Vocabulary                        .63              8.44**         .09
  Language Use                      .58              5.50*          .07
  Spelling                          .48               .05           .00
  Mathematics                       .69              2.61           .03
  Social Science                    .60              6.80*          .08
  Science                           .71             15.20**         .14
  Source                            .68             10.02**         .10

*p < .05. 

**p < .01. 



Table 4 also shows that adding the g parameter to the rule-oriented strategy 
and the response elimination strategy significantly increased the prediction 
of achievement in several areas. Thus, these data support the potential of 
individual differences in metacomponent variables as an important aspect of test 
validity. The metacomponent variables increased the prediction of achievement 
over the simple component performance variables. 
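
The indices in Table 4 were obtained from structural equation analyses, so they 
cannot be recomputed from the table alone. As a rough analogue only, the usual 
multiple regression test for an increment in explained variance can be sketched 
as follows. The formula is the standard incremental F test; the input values 
and the function name are hypothetical, with only the sample size of 99 taken 
from the study description. 

```python
def incremental_f(r2_reduced, r2_full, n, k_full, q=1):
    """F statistic for the gain in R^2 from adding q predictors to a regression.

    n: number of persons; k_full: number of predictors in the full model.
    Returns (F, delta R^2, error degrees of freedom).
    """
    delta_r2 = r2_full - r2_reduced
    df_error = n - k_full - 1
    f_value = (delta_r2 / q) / ((1.0 - r2_full) / df_error)
    return f_value, delta_r2, df_error

# Hypothetical R^2 values before and after adding one metacomponent variable.
f_value, delta_r2, df_error = incremental_f(r2_reduced=0.40, r2_full=0.46,
                                            n=99, k_full=3)
print(round(f_value, 2), round(delta_r2, 2), df_error)
```

The resulting F would be referred to an F distribution with q and df_error 
degrees of freedom. 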

The main conclusion to be drawn from these studies is that metacomponent 
latent trait models are needed to estimate more fully the processing abilities 
that underlie aptitude. Although the estimation of the metacomponent parameters 
will be complex, even the crude estimators that were used in the studies 
described above show clear contributions to aptitude test validity. 

Conclusions 

This paper has presented latent trait models that can be used for test 
design in the context of a theory about the variables that underlie task 
performance. Examples of methods for decomposing and testing hypotheses about the 
theoretical variables in task performance were given. The methods can be used 
to determine the processing components that are involved in item performance. 

Three component latent trait models for underlying theoretical variables 
were described along with their maximum likelihood estimators. The item 
parameters can be used for item banking, according to the influence of the 
underlying processing variables on item difficulty. Such estimators permit the 
test developer to choose items that represent specified information processing 
demands for the examinee. That is, the test developer can select items that are 
difficult on some processes, but easy on others. In this manner, what is measured 
by an aptitude test can be explicitly designed by specifying difficulty levels in 
the underlying processing components. 

The need for metacomponent latent trait models was also considered. It was 
shown that both items and persons vary on metacomponent parameters and that 
these parameters are important for the predictive validity of an aptitude test. 
Thus, metacomponent latent trait models should provide a better estimate of the 
abilities that are involved in test performance. 